1 Introduction

Seasonal Affective Disorder (SAD) is a mood disorder characterized by recurring episodes of depression that occur at specific times of the year, typically in autumn and winter. Seasonal variations in sunshine exposure are thought to be connected to SAD, impacting patients’ mental and physical health (Jack et al., 2023).
In Scotland, daylight hours fluctuate significantly throughout the year. SAD has been shown to affect 9.3% of primary care patients in northeast Scotland during winter, with an additional 15.6% experiencing subsyndromal symptoms (Eagles et al., 1998). Therefore, understanding the relationship between SAD and prescription trends of antidepressants is of public health interest.
This analysis focuses on one type of antidepressant, Selective Serotonin Re-uptake Inhibitors (SSRIs), which account for over 50% of all antidepressants prescribed in Scotland (NHS; Godman et al., 2019). The National Institute for Health and Care Excellence (NICE) recommends treating SAD similarly to other types of depression, including the use of antidepressants like SSRIs (NICE).

2 Methods

This report will use SSRI prescription patterns across 2022–2023, a period chosen to reflect post-COVID-19 conditions, during which antidepressant prescriptions may have been influenced not only by seasonal but also year-round factors, as SAD patients have been shown to consult frequently throughout the year (Eagles et al., 1998).
By exploring the potential correlation between SSRI prescription trends and regional sunshine hours, this analysis aims to shed light on the temporal and geographical factors influencing SAD-related prescriptions. These findings are intended to provide insights into the seasonal dynamics of mental health care in Scotland.

2.0.1 🔍 Data Sources

The following datasets were used for this analysis:

  1. NHS Scotland Prescription Data
  2. Sunshine Hours Data
    • Source: MET Office
    • Includes monthly sunshine hours across Scotland’s northern, eastern, and western regions for 2022–2023.
  3. NHS Health Board Shapefiles
  4. Scotland Population Data

2.1 Data Preparation

Set-up Environment
To start the analysis, the required libraries are loaded. These libraries include tools for data manipulation (tidyverse), file handling (readr), and visualization (ggplot2), among others:

# load libraries 
library(readr)       # Reading and writing data files
library(knitr)       # Dynamic report generation
library(tidyverse)   # Data manipulation and visualization
library(janitor)     # Data cleaning and formatting
library(dplyr)       # Data wrangling
library(here)        # File path management
library(forcats)     # Working with categorical variables
library(gt)          # Creating tables for reporting
library(sf)          # Handling spatial data
library(ggplot2)     # Data visualization
library(plotly)      # Interactive plots
library(cowplot)     # Combining plots into layouts
library(bookdown)    # Generating book-like reports

# Set global chunk options for consistent output in the report
knitr::opts_chunk$set(
  echo = TRUE,       # Display code alongside its output
  warning = FALSE,   # Suppress warning messages in the output
  message = FALSE,   # Suppress informational messages
  results = 'hide',  # Hide raw text results by default
  fig.align = "center" # Center-align figures
)

2.1.1 MET Daylight Hours Data

Function for Loading and Cleaning Sunshine Data:
The following function, load_sunshine_data(), is used to load and clean sunshine hours data from the MET Office. The function automates the extraction of relevant years and reshapes the data for analysis. It takes two arguments:
1. url: The URL of the MET data file.
2. years: A vector of years to filter the data.

load_sunshine_data <- function(url, years) {
raw_data <- read_lines(url)                    # Read the file as lines of text
data_lines <- raw_data[8:length(raw_data)] %>% # read raw data&start from 8th row (skip title)
  str_split_fixed("\\s+", 18) %>%              #split by white-space into 18 fixed width columns
  as_tibble()                                  #convert to a tibble for easier handling

# Assign meaningful column names for readability  
colnames(data_lines) <- c("Year", month.abb, "Winter", "Spring", "Summer", "Autumn", "Annual")

# Clean and filter data for analysis 
cleaned_data <- data_lines %>%
 mutate(across(everything(),as.numeric)) %>% # convert all columns to numeric type for calculations
  filter(Year %in% years) %>%                 # only keep rows for specified years
  select(Year, Jan:Dec) %>%                   # exclude seasonal aggregate columns (e.g. annual)

# pivot to long format for easier merging and analysis
   pivot_longer(
      cols = Jan:Dec,                         # reshape monthly columns into rows 
      names_to = "Month",                     # create new column for month names 
      values_to = "Sunshine Hours")           # create column for sunshine hours 

#return cleaned and reshaped data 
return(cleaned_data)
}

Load and Combine Regional Data:
This block uses the load_sunshine_data() function to load and clean sunshine hours data for Scotland’s northern, eastern, and western regions. The datasets are combined into a single table with an added region label for further analysis.

#load sunshine data for region (north, west and east), for 2022 and 2023
north <- load_sunshine_data("https://www.metoffice.gov.uk/pub/data/weather/uk/climate/datasets/Sunshine/date/Scotland_N.txt",c(2022,2023)) 
east <- load_sunshine_data("https://www.metoffice.gov.uk/pub/data/weather/uk/climate/datasets/Sunshine/date/Scotland_E.txt",c(2022,2023))
west <- load_sunshine_data("https://www.metoffice.gov.uk/pub/data/weather/uk/climate/datasets/Sunshine/date/Scotland_W.txt",c(2022,2023))

#Combine datasets and add region labels  
sunlight_data <- bind_rows(
  north %>% mutate(Region = "north"),
  east %>% mutate(Region = "east"),
  west %>% mutate(Region = "west")
  ) %>% 
  clean_names() # Standardize column names for consistency and easier use

2.1.2 Prescription Data

To analyze SSRI prescription trends across 2022-2023, I needed to process 24 large monthly datasets. Downloading and cleaning these files manually would be inefficient and error-prone. Instead, I automated the workflow by:

Step 1: Resource Mapping
This block imports a CSV file (resource_mapping.csv) containing the URLs of monthly SSRI prescription datasets. This mapping ensures efficient and reproducible access to data sources for batch processing.

# Import a CSV file mapping resource URLs to their respective prescription data
# This file provides the links to download monthly prescription datasets
#resource_mapping <- read_csv(here("data", "resource_mapping.csv"))

Step 2: Function to Process Monthly Data
This function processes monthly prescription data by downloading it from a specified URL, filtering for SSRIs, and summarizing the total prescriptions by health board. It also handles errors to ensure robustness during batch processing.

# Function to process monthly prescription data
# This function retrieves a CSV file for a specific year and month, filters the data for SSRIs,
# aggregates prescription counts by health board, and returns the summarized data.

# process_monthly_data <- function(resource_mapping, year, month) {
#     # Retrieve the URL for the specified year and month from the resource mapping
#     resource_url <- resource_mapping %>%
#         filter(year == !!year & month == !!month) %>%
#         pull(resource_url)  # Extract the resource URL
# 
# #     # Use tryCatch to handle errors gracefully during data processing
#     tryCatch({
# #         # Download and read the CSV data directly from the URL
#         monthly_data <- read_csv(resource_url, show_col_types = FALSE)
#         
# #         # Filter the data to include only SSRI prescriptions
#         ssri_data <- monthly_data %>%
#             filter(
#                 str_detect(BNFItemCode, "^040303") |  # Match SSRI BNF codes
#                 str_detect(BNFItemDescription,       # Match common SSRI drug names
#                     "citalopram|fluoxetine|sertraline|paroxetine|escitalopram")
#             ) %>%
#             group_by(HBT) %>%  # Summarize the data by health board
#             summarize(
#                 TotalPaidItems = sum(NumberOfPaidItems, na.rm = TRUE),  # Aggregate total prescriptions
#                 .groups = "drop"
#             ) %>%
#             mutate(Year = year, Month = month)  # Add year and month for reference
#         
#         # Return the processed SSRI data
#         return(ssri_data)
#     }, error = function(e) {
#         # Print an error message and return an empty tibble if an error occurs
#         message(sprintf("Failed to process data for %04d-%02d: %s", year, month, e$message))
#         return(tibble())  # Return an empty tibble on failure
#     })
# }

Step 3: Batch Processing for all Months
This block generates a grid of year-month combinations and uses the process_monthly_data() function to batch process all monthly datasets. The cleaned and combined data is saved for further analysis.

# # Generate a grid of year-month combinations for batch processing
# # This grid ensures that data for all specified months and years can be processed systematically.
# 
# years <- 2022:2023  # Define the range of years for data processing
# months <- 1:12      # Define the months (1-12) to include in the grid
# year_month_grid <- expand_grid(year = years, month = months)  # Create all possible year-month combinations
# 
# # Process all datasets and combine the results into one table
# # This block applies the `process_monthly_data()` function to each year-month combination
# # and collects the processed data into a single dataset.
# 
# all_prescription_data <- year_month_grid %>%
#   mutate(data = map2(year, month, ~ process_monthly_data(resource_mapping, .x, .y))) %>%  # Apply function for each combination
#   unnest(data)  # Combine all processed monthly datasets into one unified dataset

Validation
To ensure accuracy, I validated the processed data by checking for missing values, verifying URLs, and confirming the integrity of parsing operations. The final cleaned dataset is saved as processed_prescriptions.csv for reuse in further analysis.

- Parsing: No parsing issues were identified.  
- Missing Data: All rows contained complete prescription counts.  
- URL Verification: All resource URLs were valid and accessible. 

Final Cleaning and Saving

# # Standardize month names and clean up columns for merging with other datasets
# # This ensures that the month names are consistent and unnecessary columns are removed.
# all_prescription_data <- all_prescription_data %>%
#   mutate(month = month.abb[Month]) %>%  # Convert numeric month values to abbreviated names
#   select(-Year)  # Remove the Year column if it's no longer required
# 
# # Save the cleaned dataset as a CSV file for reuse in future analysis
# # Storing the processed data prevents the need to reprocess raw data repeatedly.
# write_csv(all_prescription_data, "data/processed_prescriptions.csv")

# Load the saved dataset for further analysis
# This reads in the previously saved processed data for continued analysis in the pipeline.
all_prescription_data <- read_csv(here("data/processed_prescriptions.csv")) %>% 
  clean_names()  # Standardize column names for easier handling

2.2 Wrangling Data

This block reads and cleans the NHS Health Board spatial data to enable mapping. It assigns regions (north, east, west) to each health board and prepares the data for merging with prescription datasets.

# Read and clean NHS Health Board spatial data
# This shapefile contains geographic boundaries and metadata for NHS Health Boards in Scotland.
NHS_healthboards <- st_read(here("data", "NHS_HealthBoards_2019.shp")) %>% 
  clean_names()  # Standardize column names for easier handling

# Assign regions to Health Boards based on their geographic location
# This step categorizes health boards into "north", "west", and "east" regions for analysis.
region_ssri_prescriptions <- all_prescription_data %>%
  left_join(NHS_healthboards, by = c("hbt" = "hb_code")) %>%  # Merge prescription data with spatial data
  mutate(region = case_when(  # Assign regions based on health board names
    hb_name %in% c("Highland", "Grampian", "Orkney", "Shetland", "Western Isles") ~ "north", 
    hb_name %in% c("Greater Glasgow and Clyde", "Ayrshire and Arran", "Lanarkshire", "Dumfries and Galloway") ~ "west", 
    hb_name %in% c("Lothian", "Fife", "Tayside", "Forth Valley", "Borders") ~ "east"
  )) %>%
  select(year, month, hbt, hb_name, region, total_paid_items)  # Retain only relevant columns for analysis

# Import and clean population data for health boards
# Population data is filtered for relevant years and used to calculate per-capita metrics.
population_data <- read_csv(here("data", "population_data.csv")) %>% 
  clean_names() %>%  # Standardize column names
  filter(year %in% c(2022, 2023), sex == "All", hb != "S92000003") %>%  # Filter for all genders and exclude total Scotland code
  select(year, hb, all_ages) %>%  # Retain only necessary columns
  rename(population = all_ages)  # Rename column for clarity

This block reads and cleans the NHS Health Board spatial data to enable mapping. It assigns regions (north, east, west) to each health board and prepares the data for merging with prescription datasets.

# Merge population data with prescription data
# This step combines the population data with prescription data to allow per-capita calculations.
final_prescription_data <- region_ssri_prescriptions %>%
  left_join(population_data, by = c("year", "hbt" = "hb")) %>%  # Match by year and health board codes
  mutate(prescriptions_per_capita = total_paid_items / population)  # Calculate prescriptions per capita

# Merge prescription data with sunshine data
# Adds regional sunshine data to the prescription dataset for combined analysis.
final_data <- final_prescription_data %>%
  left_join(sunlight_data, by = c("year", "month", "region")) %>%  # Match by year, month, and region
  select(-hbt)  # Remove health board code column as it's no longer needed

Summarize data for table and plot:
This block aggregates SSRI prescription data and sunshine hours by year and month for summary tables and visualizations.

# Summarize prescription and sunshine data for Scotland by year and month
scotland_summary <- final_data %>%
  group_by(year, month) %>%  # Group data by year and month
  summarize(
    total_prescriptions = sum(total_paid_items, na.rm = TRUE),  # Calculate total prescriptions
    avg_sunshine_hours = mean(sunshine_hours, na.rm = TRUE),   # Calculate average sunshine hours
    .groups = "drop"  # Ungroup the data after summarization
  ) %>%
  mutate(
    month = factor(month, levels = month.abb)  # Convert months to ordered factors (e.g., Jan, Feb)
  ) %>%
  arrange(year, month)  # Arrange the data by year and month for chronological order

Attach Geomtery for Maps:

# Add geographic information to the final dataset
# Merge the prescription and sunshine data with NHS Health Board spatial data for mapping
final_data_with_geometry <- final_data %>%
  left_join(NHS_healthboards, by = c("hb_name" = "hb_name"))  # Match health board names in both datasets

3 Results and Analysis

3.0.1 Figure 1: Summary Table

# Aggregate prescription and sunshine data by year, month, and season
scotland_seasonal_summary <- scotland_summary %>%
  mutate(
    # Assign seasons based on months
    season = case_when(
      month %in% c("Dec", "Jan", "Feb") ~ "Winter",  # Winter: December to February
      month %in% c("Mar", "Apr", "May") ~ "Spring", # Spring: March to May
      month %in% c("Jun", "Jul", "Aug") ~ "Summer", # Summer: June to August
      month %in% c("Sep", "Oct", "Nov") ~ "Autumn"  # Autumn: September to November
    )
  ) %>%
  group_by(year, season) %>%  # Group data by year and season
  summarize(
    total_prescriptions = sum(total_prescriptions, na.rm = TRUE),  # Total prescriptions for each season
    avg_sunshine_hours = mean(avg_sunshine_hours, na.rm = TRUE),   # Average sunshine hours for each season
    .groups = "drop"  # Ungroup after summarization
  )

# Identify the maximum number of prescriptions for each year
max_prescriptions <- scotland_seasonal_summary %>% 
  group_by(year) %>%  # Group by year
  summarize(
    max_prescriptions = max(total_prescriptions, na.rm = TRUE)  # Compute maximum seasonal prescriptions per year
  )

This figure summarizes the total SSRI prescriptions and average sunshine hours for each season in 2022 and 2023. It highlights key metrics, including the maximum prescriptions for each year and the lowest sunshine hours, to identify seasonal trends in mental health treatment.

# Create a formatted table summarizing seasonal data for prescriptions and sunshine hours
seasonal_summary_gt <- scotland_seasonal_summary %>%
  gt(rowname_col = "season") %>%  # Use the "season" column as row labels

  # Group rows by year for a clear distinction between data from 2022 and 2023
  tab_row_group(label = md("**2022**"), rows = year == 2022) %>%
  tab_row_group(label = md("**2023**"), rows = year == 2023) %>%

  # Add a title, subtitle, and footnote to provide context and details about the table
  tab_header(
    title = md("**Seasonal Summary: Prescriptions and Sunshine Hours**"),
    subtitle = md("Tracking *SSRI^1^* Prescription Trends Alongside Sunshine Patterns in Scotland")
  ) %>%
  tab_footnote(
    footnote = "SSRI: Selective Serotonin Reuptake Inhibitors."
  ) %>%
  tab_source_note(
    source_note = "Data sources: Sunshine hours from Meteorological Office; SSRI prescriptions from NHS Scotland."
  ) %>%

  # Format the numerical columns for better readability
  fmt_number(columns = "total_prescriptions", decimals = 0) %>%  # No decimals for total prescriptions
  fmt_number(columns = "avg_sunshine_hours", decimals = 2) %>%   # Two decimals for average sunshine hours

  # Hide the "year" column since it’s already represented in the row grouping
  cols_hide(columns = "year") %>%

  # Adjust column labels for clarity and visual appeal
  cols_label(
    season = "Season",
    total_prescriptions = md("**Total Prescriptions**"),
    avg_sunshine_hours = md("**Avg. Sunshine Hours**")
  ) %>%

  # Add a spanner to group metrics into a "Seasonal Metrics" category
  tab_spanner(
    label = md("**Seasonal Metrics (2022-2023)**"),
    columns = c(total_prescriptions, avg_sunshine_hours)
  ) %>%

  # Align numerical columns to the right for a professional look
  tab_style(
    style = cell_text(align = "right"),
    locations = cells_body(columns = c(total_prescriptions, avg_sunshine_hours))
  ) %>%

  # Highlight specific rows where average sunshine hours are below 50
  tab_style(
    style = cell_fill(color = "#F7BCAC"),  # Use a light color for low sunshine hours
    locations = cells_body(
      columns = avg_sunshine_hours,
      rows = avg_sunshine_hours < 50
    )
  ) %>%

  # Highlight rows with the maximum prescriptions for each year
  tab_style(
    style = cell_fill(color = "#F37A48"),  # Use a bold color to emphasize maximum values
    locations = cells_body(
      columns = total_prescriptions,
      rows = total_prescriptions %in% max_prescriptions$max_prescriptions
    )
  ) %>%

  # Add alternate row striping for better readability
  tab_style(
    style = cell_fill(color = "#FAF5EF"),  # Use a subtle background color for alternate rows
    locations = cells_body(rows = seq(1, nrow(scotland_seasonal_summary), 2))
  ) %>%

  # Highlight year group rows with bold, contrasting styles for better sectioning
  tab_style(
    style = list(
      cell_fill(color = "#B7410E"),  # Bold background color for group rows
      cell_text(color = "white", weight = "bold")  # White, bold text for contrast
    ),
    locations = cells_row_groups(groups = everything())
  ) %>%

  # Adjust the table font size and row spacing for a cleaner layout
  tab_options(
    table.font.size = "small",           # Reduce font size for compactness
    data_row.padding = px(5)            # Adjust padding for better spacing
  )

# Display the formatted table
seasonal_summary_gt
Seasonal Summary: Prescriptions and Sunshine Hours
Tracking SSRI1 Prescription Trends Alongside Sunshine Patterns in Scotland
Seasonal Metrics (2022-2023)
Total Prescriptions Avg. Sunshine Hours
2023
Autumn 1,030,282 81.71
Spring 1,043,098 142.48
Summer 1,032,971 163.30
Winter 993,443 40.10
2022
Autumn 1,012,044 75.12
Spring 1,028,739 141.39
Summer 1,013,794 153.37
Winter 979,260 45.18
Data sources: Sunshine hours from Meteorological Office; SSRI prescriptions from NHS Scotland.
SSRI: Selective Serotonin Reuptake Inhibitors.

Figure 1 Analysis

The seasonal summary table reveals distinct temporal trends in SSRI prescriptions. While sunshine hours follow predictable seasonal patterns (lowest in winter, highest in summer), prescription rates peak in Spring 2023, likely reflecting a lag effect of winter light deprivation (Eagles et al., 1998; Lansdall-Welfare et al., 2019).
Conversely, winter prescriptions are lower than expected, challenging the assumption that winter is the primary SAD treatment period. The spring peak suggests that symptoms persist beyond winter, leading to delayed treatment-seeking behavior.
To better understand these trends, a deeper examination of monthly patterns is necessary, as shown in Figure 2.

3.0.2 Figure 2: Monthly Plot

This plot visualizes the monthly SSRI prescription trends in Scotland for 2022 and 2023. The figure highlights key peaks in March, August, and November and explores the lagging effects of winter sunlight deprivation and non-seasonal stressors.

# Add a hover text column for better tooltips in the interactive plot
# This text will display detailed information when hovering over points in the plot
scotland_summary <- scotland_summary %>%
  mutate(
    hover_text = paste(
      "Month:", month, "<br>",                     # Display the month
      "Year:", year, "<br>",                      # Display the year
      "Total Prescriptions:",                    # Display the total prescriptions in x10^5 format
      round(total_prescriptions / 1e5, 2), "x10^5"
    )
  )

# Create the base ggplot for visualizing SSRI prescriptions trends over months
p <- ggplot(scotland_summary, aes(
  x = month,                                     # Month on the x-axis
  y = total_prescriptions / 1e5,                 # Prescriptions scaled to x10^5 on the y-axis
  group = year,                                  # Group data by year for separate lines
  color = factor(year),                          # Different colors for each year
  text = hover_text                              # Add custom hover text for tooltips
)) +
  geom_line(linewidth = 1) +                     # Add lines to connect monthly points
  geom_point(size = 2, alpha = 0.8) +            # Add points to highlight monthly values
  scale_y_continuous(
    breaks = scales::pretty_breaks(n = 5),       # Set y-axis breaks for better readability
    labels = scales::label_number(accuracy = 0.1)  # Format y-axis labels with one decimal point
  ) +
  scale_color_manual(
    values = c("2022" = "#F7BCAC", "2023" = "#F37A48")  # Assign custom colors to each year
  ) +
  labs(
    title = "Monthly SSRI Prescriptions in Scotland",        # Add plot title
    subtitle = "Trends in SSRI prescriptions across Scotland over two years (2022-2023).",  # Add subtitle
    x = "Month",                                            # Label for x-axis
    y = "Total Prescriptions (x10^5)",                     # Label for y-axis
    color = "Year"                                         # Legend label for color
  ) +
  theme_minimal()                                          # Use a minimal theme for a clean look

# Convert the ggplot object to an interactive plot with tooltips
# The "tooltip" parameter specifies that the hover text should use the "text" aesthetic
interactive_plot <- ggplotly(p, session = "knitr", tooltip = "text")

# Display the interactive plot
interactive_plot

Figure 3.1: Monthly Plot

Figure 2 Analysis

Figure 2 highlights monthly variations in SSRI prescriptions, with peaks in March, August, and November:
- March Peak: A 5.94% increase in prescriptions compared to the yearly average (2023) reflects the lingering effects of winter’s reduced light exposure.
- August Spike: This unexpected peak may be driven by non-seasonal societal stressors, such as back-to-school anxiety and post-holiday adjustments (Jack et al., 2023).
- November Increase: The rise in November aligns with shorter days and the return to GMT, known triggers for SAD symptoms.

Calculation to find percentage increase:

Click to view code
# Filter data for the year 2023
# This isolates the dataset to only include records for 2023
data_2023 <- scotland_summary %>% 
  filter(year == 2023)

# Extract total prescriptions for March 2023
# This retrieves the total prescription value for the specific month (March) in 2023
march_2023 <- data_2023 %>% 
  filter(month == "Mar") %>%  # Filter for March
  pull(total_prescriptions)   # Extract the value of total prescriptions

# Calculate the average monthly prescriptions for 2023
# This computes the mean value of total prescriptions across all months in 2023
average_2023 <- mean(data_2023$total_prescriptions, na.rm = TRUE)  # Handle any missing values

# Calculate the percentage increase for March compared to the yearly average
# Formula: ((March value - average value) / average value) * 100
percentage_increase <- ((march_2023 - average_2023) / average_2023) * 100

# Print the result
# Display the calculated percentage increase with two decimal places
cat(sprintf("The percentage increase for March 2023 compared to the yearly average is %.2f%%.", percentage_increase))

3.0.3 Figure 3: Regional Analysis Maps

These maps display regional patterns in SSRI prescriptions, per-capita prescription rates, and average sunlight hours across Scotland. They reveal how geographic disparities, such as sunlight exposure, influence mental health outcomes and treatment trends.

Subfigures:
Map A: Average Sunshine Hours by Health Board
Map B: Total SSRI Prescriptions by Health Board
Map C: Average Prescriptions Per Capita by Health Board

Step 1: Wrangle Map Data
This block defines a helper function to aggregate metrics by health board and year, merge them with spatial data, and prepare the results for mapping.

# Step 1: Create a helper function to prepare map data
# This function aggregates a specified metric by health board and year, joins the results with spatial data, 
# and ensures the output is a spatial feature object for mapping.

prepare_map_data <- function(data, metric_column, new_column_name) {
  
  # Step 1.1: Aggregate the specified metric by health board and year
  # Group the data by health board name (hb_name) and year, then calculate the mean value of the specified metric.
  aggregated_data <- data %>%
    group_by(hb_name, year) %>%  # Group by health board and year for aggregation
    summarize(
      value = mean({{ metric_column }}, na.rm = TRUE),  # Calculate the mean value, ignoring missing values
      .groups = "drop"  # Drop grouping after summarization to avoid unintended behavior in subsequent operations
    )
  
  # Step 1.2: Rename the aggregated metric column
  # Dynamically rename the column for clarity, using the provided new_column_name argument.
  renamed_data <- aggregated_data %>%
    rename(!!new_column_name := value)  # Use `:=` for dynamic renaming based on the function's input
  
  # Step 1.3: Merge the aggregated data with spatial data
  # Join the aggregated data with NHS health board geometries to enable spatial mapping.
  joined_data <- renamed_data %>%
    left_join(NHS_healthboards, by = "hb_name")  # Ensure the "hb_name" column matches in both datasets
  
  # Step 1.4: Convert the result to a spatial feature (sf) object
  # Explicitly transform the joined data into an sf object, required for spatial visualization with ggplot2.
  spatial_data <- st_as_sf(joined_data)
  
  # Step 1.5: Return the processed spatial data
  return(spatial_data)
}

Step 2: Prepare Data for Mapping
Here, the prepare_map_data() function is applied to create datasets for average sunlight hours, total prescriptions, and prescriptions per capita.

# Each call aggregates a specific metric, joins it with spatial data, and prepares it for mapping.
# Prepare map data for average sunlight hours
# Aggregates sunshine hours by health board and year, calculating the average for mapping purposes.
sunlight_map_data <- prepare_map_data(
  final_data_with_geometry,  # Input dataset containing prescription and sunshine data
  sunshine_hours,            # Metric to aggregate: total sunshine hours
  "avg_sunshine_hours"       # Name of the new column for the aggregated data
)

# Prepare map data for total SSRI prescriptions
# Aggregates the total number of SSRI prescriptions by health board and year for visualization.
prescriptions_map_data <- prepare_map_data(
  final_data_with_geometry,  # Input dataset containing prescription data
  total_paid_items,          # Metric to aggregate: total paid items for prescriptions
  "total_prescriptions"      # Name of the new column for the aggregated data
)

# Prepare map data for per-capita SSRI prescriptions
# Aggregates per-capita prescription rates by health board and year for normalization.
per_capita_map_data <- prepare_map_data(
  final_data_with_geometry,  # Input dataset containing per-capita prescription rates
  prescriptions_per_capita,  # Metric to aggregate: prescriptions per capita
  "avg_prescriptions_per_capita"  # Name of the new column for the aggregated data
)

Step 3: Define the Map Creation Function
This function standardizes the process of creating maps by defining how data is visualized, labeled, and styled.

# This function takes spatial data and mapping parameters to produce a customized ggplot map.
create_map <- function(data, value_column, title, fill_label) {
  ggplot(data) +  # Initialize the ggplot object with the provided spatial data
    geom_sf(  # Add spatial features (polygons) to the plot
      aes_string(fill = value_column),  # Dynamically specify the fill aesthetic using the value_column
      color = "black"  # Set the border color for polygons
    ) +
    scale_fill_viridis_c(  # Apply a Viridis color scale for better visual accessibility
      option = "inferno",  # Use the "inferno" color palette
      name = fill_label    # Set the legend label to the provided fill_label
    ) +
    labs(  # Add titles and a data source caption
      title = title,  # Main title for the map
      caption = "Data sources: NHS Scotland and MET Office"  # Attribution for data sources
    ) +
    theme_minimal() +  # Apply a minimal theme for a clean appearance
    theme(
      plot.title = element_text(hjust = 0.5, face = "bold"),  # Center-align the title and make it bold
      legend.position = "right"  # Place the legend on the right side of the map
    ) +
    facet_wrap(~year)  # Create a separate map for each year in the dataset
}

Step 4: Generate Maps
This block uses the create_map() function to generate individual maps for each metric.

# Generate a map of average sunlight hours
# This map visualizes the average sunshine hours by health board and year
map_sunlight <- create_map(
  sunlight_map_data,            # Input spatial dataset for sunlight
  "avg_sunshine_hours",         # Column representing average sunshine hours
  "Average Sunlight Hours by Health Board",  # Map title
  "Avg. Sunlight Hours"         # Legend label
) + coord_sf(datum = NA)  # Remove coordinate reference system to avoid distortion

# Generate a map of total SSRI prescriptions
# This map visualizes the total number of SSRI prescriptions by health board and year
map_prescriptions <- create_map(
  prescriptions_map_data,       # Input spatial dataset for prescriptions
  "total_prescriptions",        # Column representing total prescriptions
  "Total Prescriptions by Health Board",  # Map title
  "Total Prescriptions"         # Legend label
) + coord_sf(datum = NA)  # Remove coordinate reference system to avoid distortion

# Generate a map of average per-capita SSRI prescriptions
# This map visualizes the average number of prescriptions per capita by health board and year
map_avg_per_capita <- create_map(
  per_capita_map_data,          # Input spatial dataset for per-capita prescriptions
  "avg_prescriptions_per_capita",  # Column representing prescriptions per capita
  "Average Prescriptions Per Capita by Health Board",  # Map title
  "Prescriptions Per Capita"    # Legend label
) + coord_sf(datum = NA)  # Remove coordinate reference system to avoid distortion

Step 5: Combine maps
The final step combines the individual maps into a single figure for comparison using plot_grid().

# Combine individual maps into a single figure
# This step arranges the three maps (sunlight, total prescriptions, per capita prescriptions)
# into one visual layout for easy comparison.

combined_maps <- plot_grid(
  map_sunlight,               # Map showing average sunlight hours
  map_prescriptions,          # Map showing total prescriptions
  map_avg_per_capita,         # Map showing per-capita prescriptions
  labels = c("A", "B", "C"),  # Add labels to identify each map (optional)
  ncol = 1                    # Arrange maps in a vertical layout; change to `ncol = 2` for horizontal layout
)

# Display the combined map layout
combined_maps

Figure 3 Analysis

The maps in Figure 3 reveal significant regional trends in sunlight exposure and SSRI prescriptions:

  • Map A: Average Sunlight Hours: Northern regions (e.g., Orkney, Shetland, and the Western Isles) report lower sunlight exposure, reflecting their latitude. This likely exacerbates SAD symptoms (Lansdall-Welfare et al., 2019).
  • Map B: Total Prescriptions: Urban centers like Greater Glasgow and Clyde exhibit the highest total prescription volumes, reflecting their larger populations and access to healthcare infrastructure.
  • Map C: Per-Capita Prescriptions: High per-capita prescription rates in rural northern areas underscore the compounded challenges of low sunlight exposure and limited mental health resources.

These findings highlight the need for targeted mental health strategies, particularly in northern and rural regions.

4 Discussion

4.1 Synthesis of Key Findings

Temporal Trends

As shown in Figure 2, monthly SSRI prescriptions reveal distinct peaks in March, August, and November:

  • Spring Lag Effect (March): The peak in March supports the hypothesis of a delayed psychological response to winter’s reduced sunlight. Research suggests that sunlight deprivation effects can linger for months, with individuals often seeking treatment later in the season (Eagles et al., 1998; Lansdall-Welfare et al., 2019).
  • Unexpected August Spike: Contrary to seasonal expectations, the August peak may reflect non-seasonal societal stressors, such as back-to-school anxiety and post-holiday transitions (Jack et al., 2023).
  • Winter Preparatory Prescriptions (November): The increase in November aligns with the onset of shorter days and the return to GMT, which are known triggers for SAD symptoms.

Notably, the March peak challenges traditional assumptions that winter is the primary treatment period for SAD, suggesting extended seasonal effects and potential delays in symptom recognition or treatment-seeking behavior.

Regional Variability

The geographical patterns in Figure 3 highlight disparities in sunlight exposure and prescription trends:

  • High Per-Capita Prescriptions in Northern Regions: Areas like Orkney, Shetland, and the Western Isles report disproportionately high per-capita prescriptions due to extreme seasonal light fluctuations and limited access to alternative treatments.
  • Urban vs. Rural Dynamics: While urban centers such as Greater Glasgow and Clyde show higher total prescriptions, these regions report lower per-capita rates, reflecting the availability of diverse mental health resources beyond medication.

These findings underscore the need for region-specific interventions, particularly in high-risk northern and rural areas.

Relationship Between Prescriptions and Seasonal Patterns

As illustrated in Figure 1, SSRI prescriptions deviate from sunshine hours, which follow predictable seasonal patterns. While sunshine hours peak in summer and dip in winter:

  • Prescriptions peak in March and November, coinciding with transitional periods in daylight.
  • The summer anomaly in August highlights the role of non-seasonal drivers, including societal stressors.

This decoupling of sunlight and prescription trends suggests that SAD-related treatment-seeking behavior is influenced by a combination of environmental and societal factors.

4.2 Limitations and Future Directions

Limitations
1 Data Specificity: The absence of diagnostic data limits this study to general SSRI usage trends rather than SAD-specific patterns.
2. Confounding Variables: Factors such as socioeconomic conditions, healthcare accessibility, and cultural differences were not included but could significantly influence prescription trends.
3. Aggregation Levels: Health board-level data may obscure localized patterns, such as those occurring at the community or GP level.

Future Research
To address these limitations, future studies should: - Incorporate Diagnostic Data: Linking SSRI prescriptions to SAD diagnoses could provide clearer insights into treatment patterns.
- Explore Non-Seasonal Drivers: Investigate societal stressors, like economic pressures or healthcare access, which may contribute to unexpected trends like the August spike.
- Examine Alternative Treatments: Assess the effectiveness of light therapy or counseling services and their availability in high-risk regions.


💡 Policy Implication: Increasing access to light therapy in northern and rural areas could mitigate the over-reliance on SSRIs, particularly during peak seasons.


5 Conclusion

This analysis highlights significant temporal and regional variability in SSRI prescriptions across Scotland:

  • Temporal Trends: Peaks in March, August, and November suggest a complex interaction between environmental factors (e.g., sunlight) and societal drivers (e.g., stress).
  • Regional Disparities: Higher per-capita prescriptions in northern regions reflect compounded challenges of low sunlight exposure and limited access to alternative mental health resources.

These findings challenge traditional assumptions about SAD-related treatment and underscore the need for regionally tailored interventions. Expanding access to light therapy and promoting awareness campaigns during late winter and early spring could improve the management of seasonal mood disorders, reducing their burden on public health.

🌍 Broader Impact: By addressing these disparities, policymakers can improve mental health outcomes across diverse populations, ensuring equitable access to treatment for those most affected by seasonal mood disorders.

7 AI Disclaimer

During the development of this report, I utilised AI as a tool to enhance both the analytical process and the presentation of the findings. Specifically:
Warnings and Error Resolution: AI tools helped me to understand, so I could troubleshoot, warnings and errors encountered during data processing. This ensured that the analysis pipeline ran smoothly and efficiently.
File Integration and Parsing: Guidance was provided by AI to integrate the MET hours text file into the workflow, as this was a type of file I had not encountered before. This included parsing the large datasets and automating the cleaning process for improved efficiency.
CSS File Creation and Optimization: AI was used to help me understand how to design and refine a custom CSS file that tailored the report for both digital and print outputs. Adjustments included:
1. General font sizing and line spacing for readability.
2. Compact and professional styling for code blocks, figures, and headers.
3. Print-specific optimizations, such as hiding the table of contents (TOC) and managing spacing for a more polished and efficient print layout.
4. Ensuring code blocks could split across pages without disrupting readability.